Semi-parametric Estimates under Biased Sampling
Abstract
In observational studies, subjects may self-select, thereby creating a biased sample. Such problems arise frequently, for example in astronomical, biomedical, animal, and oil studies, in survey sampling, and in econometrics. For a typical subject, let Y denote the value of interest, and suppose that Y has an unknown density function f. Further, let w(y) denote the probability that the subject includes itself in the study given Y = y. Then the conditional density of Y, given that it is observed, is f∗(y) = w(y)f(y)/κ, where κ = ∫ w(y)f(y) dy is the normalizing constant. The problem of estimating w and f from a biased sample X1, ..., Xn drawn independently from f∗ is considered when f is known to belong to a parametric family, say f = fθ, where θ is a vector of unknown parameters, and w is assumed to be non-decreasing. An algorithm for computing the maximum likelihood estimator of (w, θ) is developed, and its consistency is established. Simulations show that the method is feasible with moderate sample sizes, and applications to animal and oil data are given.
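The biased-sampling mechanism described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the estimation algorithm of the paper: the exponential density f, the inclusion probability w(y) = 1 - exp(-y), and all function names are hypothetical choices made for illustration. Each subject draws Y from f and includes itself with probability w(y), so the retained values follow f∗(y) = w(y)f(y)/κ.

```python
import math
import random


def w(y):
    """Hypothetical non-decreasing inclusion probability with values in [0, 1)."""
    return 1.0 - math.exp(-y)


def kappa(rate=1.0):
    """Normalizing constant: integral of w(y) f(y) dy for f exponential with this rate."""
    return 1.0 - rate / (rate + 1.0)


def biased_density(y, rate=1.0):
    """Density of the observed values: f*(y) = w(y) f(y) / kappa."""
    f = rate * math.exp(-rate * y)  # assumed parametric density f_theta
    return w(y) * f / kappa(rate)


def draw_biased_sample(n, rate=1.0, seed=0):
    """Draw n observed values by letting each subject self-select."""
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        y = rng.expovariate(rate)  # Y ~ f
        if rng.random() < w(y):    # subject includes itself with probability w(y)
            sample.append(y)
    return sample


if __name__ == "__main__":
    xs = draw_biased_sample(20000)
    print("mean of biased sample:", sum(xs) / len(xs))
    print("mean under f (unbiased):", 1.0)
```

Because w is increasing here, larger values of Y are over-represented, so the mean of the observed sample exceeds the mean under f. For these particular choices with rate 1, κ = 1/2 and the mean under f∗ works out to 3/2.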
Related articles
Semi-parametric efficiency bounds for regression models under choice-based sampling
We extend the Bickel–Klaassen–Ritov–Wellner theory of semi-parametric efficiency bounds to the case of sampling from several populations, and discuss the form of the efficient score and efficient influence function in this situation. The theory is applied to obtain an information bound for estimates of parameters in general regression models under case-control sampling. The variances of the sem...
Semi-parametric efficiency bounds for regression models under generalised case-control sampling: the profile likelihood approach
We obtain an information bound for estimates of parameters in general regression models where data are collected under a variety of response-selective sampling schemes. The asymptotic variances of the semi-parametric estimates of Scott and Wild (1986, 1997, 2001) are compared to the bound, and the estimates are found to be fully efficient.
The accelerated failure time model under biased sampling.
Chen (2009, Biometrics) studies the semi-parametric accelerated failure time model for data that are size biased. Chen considers only the uncensored case and uses hazard-based estimation methods originally developed for censored observations. However, for uncensored data, a simple linear regression on the log scale is more natural and provides better estimators.
Bayesian Identification of Semi-Parametric Binary Response Models
In this paper, minimal conditions under which a semi-parametric binary response model is identified in a Bayesian framework are presented and compared to the conditions usually required in a sampling theory framework. Running headline: Semi-parametric Binary Response Models.
Accounting for animal movement in estimation of resource selection functions: sampling and data analysis.
Patterns of resource selection by animal populations emerge as a result of the behavior of many individuals. Statistical models that describe these population-level patterns of habitat use can miss important interactions between individual animals and characteristics of their local environment; however, identifying these interactions is difficult. One approach to this problem is to incorporate ...